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Abstract 

Perron-Frobenius theory treats the existence of a positive eigen-vector associated with the prin- 
cipal eigen- value A* of a non-negative matrix, say Q. A simple method for approximating this 



eigen-vector involves computing the iterate A^^Q^" , for large n. In the more general case that Q 
pL^ ■ is a non-negative integral kernel, an extended Perron-Frobenius theory applies, but it is typical that 

^\ . neither the principal eigen-function nor the iterate X^"Q^"'' can be computed exactly. In this setting 

we propose and study an interacting particle algorithm which yields a numerical approximation of the 
principal eigen-function and the associated twisted Markov kernel. We study a collection of random 
■ integral operators underlying the algorithm, address some of their mean and path-wise properties, 

, and obtain Lr error estimates. Examples are provided in the context of a classical neutron model 

studied by Harris, a Bellman optimality equation and a rare event estimation problem. For the rare 
event problem we show how the proposed algorithm allows unbiased approximation of a Markov 
importance sampling method by conditional simulation. 



00 ■ 1 Introduction 

I Oil ^ state space X consider a bounded function G : X ^ IR+ , a Markov kernel AI and the kernel Q given 

by 

CN ■ 

Q{x,dx') -.^ G{x)M {x,dx') . 



In an appropriate function space setting and under some regularity assumptions. Q has an isolated, real, 
maximal eigen- value A^,, with which is associated a positive eigen-function hi,, 

QiK) = X,hi„ (1) 

where for a function if on X, we write the operator Q (if) (x) := J Q (x, dx') Lp {x'). 

When X is finite, A* is the Perron-Frobenius eigen- value. Treat ment o f the e xistence of A* and /i^, 
outside of settings in which X is finite dates back to the work of iHarris |l963l |. where Q arose as a 
conditional moment measure associated with a branching process. 

There are many other situati ons in which /i^ appears, of ten related to the large deviations theory of 



Markov chains, see for example |Nev and Nummelinl . |l987|, where Perron-Frobenius theory is central 



If (X„;n > 0) is a Markov chain with transition kernel M, initialized from ATq ~ x, J7 an appropriate 
function and G{x) — e"^^^'> for a particular value of a, then it is only and explicitly through ft-* (a;) that 
the initial condition enters Bahadur- Rao- type asymptotics associated with partial sums X]p=o ^i-^p) 
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Kontoviannis and Mevij . l2003j . In a related manne r, is used to define an optimal change of mea sure 



in methods for estimating rare event probabilities 



Buckle w el 



al 



1990 



Dupuis and Wand . I200a | and 



Borkar and Mevn 



200 



ifln 



all of these 



appears as a value function in risk-sensitive control problems 
situations another object of interest, both analytically and computationally, is the "twisted" Markov 
kernel: 

P4^,d^'):=^%^^i:M£l. (2) 



The work presented here addresses the problem of numerically approximating hi, and Pi,. 

Of course the eigen-function equation ^ is just one side of the story. Accompanying hi, is an eigen- 
measure, which can be normalized to a probability measure ry*. 



where for a measure ry, we write the operator rjQ (■) :~ 
studied the non-linear operator on measures 



(3) 



/ T] (dx) Q (x, •)• bel Moral and Miclol |2003 | 



$ : ry 



VQ 



(4) 



(where 1 is the unit function on X). Under regularity assumptions, $ is contractive and 77^ arises as its 
unique fixed point. In deed integrating both sides of (|3]) yi elds r]i,Q (1) = so that $ (r ?^) = ry* is then 
just a re- writing of ([3]) . 



Del Moral and Miclo 



2003l | (see also 



Del Moral and Doucet 



2004] ) suggested and 



analyzed an interacting particle algorithm to approximate 77^ and A*, using <f>. The present work builds 
from that approach by suggesting how one may additionally approximate hi, and Pi,. Implementation of 
the method we propose requires little more than the ability to sample from AI and evaluate the density 
of Q with respect to a dominating measure. 

We study various objects which underlie the algorithm and then go on to illustrate some of its 
properties with examples. In some applications computing hi, is the main ob j ective and we provide 

1963l | and a particular 



Harris 



illustrations in the context of a neutron model from the original work of [ 
average-cost Bellman equation. However the proposed algorithm allows us to go somewhat further 
because the approximation of Pi, which we obtain is easy to sample from. Several method s for Markov 
chain rare event estimation [see for example iBucklew et al.l . Il990l . iDupuis and Wand . |2005| hinge on the 



ability to compute the eigen-quantities and sample from P^, so their practical applicability is usually 
limited t o quite restrict i ve ca ses, such as when X is finite or (Xn) are actually independent. In the 
setting of lBucklew et al.l 1990f , we demonstrate the possibility of using the particle algorithm to perform 



unbiased conditional importance sampling for these rare event problems. 



2 Overview 

2.1 Notation and conventions 

Let X be a state space endowed with a countably generated tr-algebra B and let C be the Banach 
space of real-valued, ;B-measurable, bounded functions on X endowed with the infinity norm ||/|| := 
supj.gx l/(-'^)l- For a possibly signed measure 77, a function ip, and a possibly signed kernel K we write 
fjL{(p) := / ip{x)fj,{dx), K{ip){x) :~ J K{x^dy)ip{dy), and fJ-K{-) := J fi{dx)K(x, ■). The collection of 
probability measures on (X, B) is denoted by V and the total variation norm for possibly signed measures 
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is denoted ||77]| := sup^.|^|<]^ |?7((y9)|. The operator norm corresponding to C is 

|||A'|||:= sup \\K{^)\\. 

y:|<p|<l 

The n-fold iterate of K is denoted by JsT'"-' and for (A'„; n > 1) a collection of integral kernels and any 
< p < n, we write 

:= Id, p = n, Kp^n ■= Kp+i ■ ■ ■ Kn, n > p. (5) 

Let G : X ^ (0,oo) be a i3-measurable, bounded function and let AI : X x B ^ [0, 1] be a Markov 
kernel. Then define the integral kernel Q{x,dy) := G{x)M{x, dy). As Q is non-negative we have 

|||Q|||=supQ(l)(x)=supG(a;), 

and IIIQIII < oo due to G being bounded. The spectral radius of Q as a bounded linear operator on C is 

e:= lim |||Q(")r/" 

where the limit always exists by sub-additivity. 

For any sequence (a„; n > 1) and £ > p, we take nn=« o^n = 1 by convention. The unit function on X 
or Cartesian products thereof is denoted by 1. We will write the indicator function ![•] or sometimes I^i 
for a set A cX. 

2.2 The eigen-problem and the algorithm 

Unless stated otherwise, we will assume throughout 

(H) there exists a probability measure v such that for all x, Q{x, •) is equivalent to ly. There exist 
constants < e~ , < oo such that the corresponding Radon-Nikodym derivative, denoted by 
q{x,x') :— ^^^^^-^{x') satisfies 

< q{x,x) < e^, Vx,x'' G X. 



In some places we it will be convenient to use the implication of (H) 

e-i/(-) < Q(a-,-) < e+J^(-), Vx e X. 

Remarks on this assumption are postponed until the end of Section [51 

In Section [5]we address some elements of a generalized Perron-Frobenius theory for the kernel Q, built 
from results of 



Nummelin 



2004| . Theorem [T] deals with the existence, uniqueness and characterization 



of an eigen-measure 77* € T' and a positive eigen-function hi, Cz C satisfying 



T]*Q = A*r/*, 



' {K) = X^K, (hi,) = 1, 



(6) 



where A^, is the principal eigen-value of Q. Furthermore, we obtain a multiplicative ergodic bound of the 
form 



ArO'"'-/i*®^.||| <2p" ( — ) , n>l, 



(7) 



where p := 1 — e j < 1. The bound (jT]) suggests that iterates of Q may play a useful role in a method 
for approximating hi,. Indeed when X is finite this is the basis for the well known "power method" 
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Horn and Johnson 



IQQoj alluded to in the abstract. However, direct exploitation of ([7]) for more general 



X is typically not possible because in many situations the iterates of Q cannot be computed exactly, and 
Xi, is also unknown. 

The first main contribution of the present work is to propose the algorithm below. It may be regarded 
as an approximation of the afore-mentioned power method, with computations performed on a stochastic 
grid which is defined by an interacting particle system. The parameters of the algorithm are: N , the 
particle population size; n, the (half) time-horizon; and /.t, an initial probability distribution. As we shall 
see, the values of N and n influence the accuracy of the approximation and the choice of /i turns out to 
be somewhat unimportant. A more precise probabilistic specification of the algorithm is given in Section 

El 

Algorithm 1 Particle method for computing principal eigen-quantities 
Forward 



Initialization: 

Sample (Co),^^^ 
For p = 1, 2n, : 



Sample {C,^^t, 

Backward 

Initialization: 

Set h2n.2n{x) = 1, X G X 

For p = 2n — I, ...,n, : 

^ q{x,Ci+i) 



N iid 



J2f=i G (Cp-i) 



Set h^2ni^) — - ^p+l,2n(Cp+l)- xeX 

j = l L^i=l l[Qpj Qp+l) 



We will take the random function 2n approximation of /i* and the random kernel 



as an approximation of P^,. Note that, if so desired, each hp2n appearing in the algorithm can be 
evaluated at any point x € X, but each step of the backward recursion actually requires evaluation of 
hp_^i 2n only on the random grid {Cp+i;* = 1, ■■■,N^. Further note the subscripting in P^2n) -"^^^ 
semigroup index notation of ([5]), and pertains only to the particular kernel in ([5]). Occurrences will be 
kept to an absolute minimum. 



2.3 Summary of the results 

The second main contribution is to study various operator functionals underlying the algorithm. Ul- 
timately, our objective is to validate the algorithm by obtaining L,., (r > 1), bounds on the errors 
hn,2n{^) ~ h*{x) and 2n)('''' A) — P^,{x, A), for X G X, A € B, in terms of N and n. We now summarize 
and collate some of the main developments, the reader is directed to the appropriate sections for the full 
statements. 
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Particle Approximations 

In Section [S] we introduce collections of random probability measures (rj^ ; ?t- > O) , random non- negative 
integral operators (Q^ ',n> 1) and positive random variables (A^ ; ?^ > O) defined by 



1 ^ 



TV 

i=l 



where the sequence (C^) is as in the algorithm and $ as in equation We observe in Lemma [S] that, 
mirroring ([S]), they satisfy for any A'^ > 1, any n > 1 and < p < n, 

N^N _ -kN N (iN \ _ \NiN N / iN ^ _ i /q\ 

Vp — '^p Vp+l7 \'l'p+l,n) — '^p ''■p,nJ Vp {'>'p,n) — ^-i Wj 

where the random positive functions {h^^ are implicitly defined by the algorithm. Mirroring ([5]), we 
see that the approximation of the twisted kernel ([5]) may be expressed as 

^n~l"-n-l,2n\-'') 

and from © that it is indeed Markov, for any n and N. 
Lack of Bias 

In Proposition [5] of Section [S] we find that iterates of the random operators (Q^) can be used to obtain 
unbiased estimates of iterates of the underlying operator Q, in the sense that for any fi' £ V, (possibly 
different from fi used in the initialization of the algorithm), any ip € C and A'^ > 1 and n > 0, 

Ejv yQ^,„ (^)] - A^'Q(") (^) . (10) 

where E^v denotes expectation with respect to the law of the A'^-particle system. Thus we may interpret 
([7]) as providing a multiplicative ergodic theorem (MET) in mean for the random operators (Q^), 
because ([7]) combined with (fTU)) immediately gives a bound of the form 

sup sup |A-"E^ [/.'g^„ (^)] - fi' (^)l < ll^il p"2 (—) . (11) 

IJ.'GVN>1 \e / 

where p is as in (O. 
Path-wise stability 

It is natural to look for some form of corresponding sample path result, and in Theorem[2]it is established 
that 



sup sup sup 

li'eV N>1 tdGfiiv 



\P=0 / 



{uj)<Mp-2['-], (12) 



where fl^ denoted the sample space underlying the algorithm with population size N and p := 1 
{e-/e+f < 1. 
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It is well known that under (H), the semigroup associated with the non-linear operator <i>, i.e. 

'n(«) 



$ o • • • o $ : // — , , , , , 
' . ' ^ /i'Q(")(l)' 

n times 

is exponentially stable with respect to initial conditions. In Proposition [T] of Section 2] it is shown that 
this property can be derived from the MET and we find: 



sup 



^'Q(«) (^) 



//'Q(") (1) 



< ll^llp"4(-3 



which is a slight sharpening of existing results, e.g. [Del Moral and Doucetl . |2004 Theorem 1], in the 
sense that we obtain the rate p as opposed to p. Some manipulation of ([T^ (again see Theorem[2) yields 



sup sup sup 

p'e'P N>i uien^ 



^^'QL (1) 



vi: if) 



H<ll¥'ll/5"2 



(13) 



Thus we find that the corresponding random (and generally path-wise inhomogeneous) non-linear semi- 
group also enjoys exponential stability, on sample paths, uniformly with respect to N. 

Lr error estimates 

The type of uniform path-wise convergence displayed in (jl2|) and (|13p plays an important role in the 
proof of Theorem [3] in Section |51 where the above ingredients are gathered together in treating the stated 
primary objective of quan tifying the error i n app roximations of /i^,, and P^,: Theorem |31 which in part 
uses some arguments from 



Del Moral et al 



£PI 



2010j, establishes Lr error estimates of the form 



sup Ejv 



^x) - K{x)\ 



1/- ^ Br,h 



N 



sup sup E TV 



P{n,2n) ^) ~ P* ^) 



< — — + p Op, 



TV 



(14) 



(15) 



for any r > 1, some finite constants C/,. and Cp which depend only on e^,e^ and finite constants 
Br^h,Br^p which depend additionally on r. The errors are thus controlled in A'^ and n, and in these 
bounds there is no dependence on the measure fi used in the initialization of the algorithm. 



2.4 Preliminary remarks 

• The "f orward" part of the algorithm has been suggested by 



Del Moral and Miclo 



200; 



3, 



Del Moral and Doucet 



20041 in order to approximate r]^, and A* using the empirical probability measures [r]^)- Defining 

n — l 

^n--=i-T.^OgX^, (16) 



they proved estimates of the form 



E 



N 



Vn (V)-'7*(<P) 



p=0 



E 



N 



!/'■ / B. 

< C 



N n 
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Del Moral and Doucet 



for some constants C < cx3 and p < 1 (see [Del Moral and Douceti . |2004 Theorem 2] for precise 
details) . 



2004] addressed the case that the function G may vanish, and 



a weaker "multi-step" version of (H) . Similar techniques as used therein can be applied in the 
present context, but involve notational complications. 

• The type of recursion in the backward part of the algorithm is implicitly present (albeit expressed 



some what differently) i n oth er interacting particle algorithms, see for example 



2010] and 



Douc et al 



Del Moral et al. 



201l| . in the context of non- linear filtering/smoothing. The main novelty 



of the present work stems from finding the connection between the backward recursion and /i^,, P*, 
and the study of the objects which then appear. 

• The assumption (H) rarely holds when X is non-compact, but allows a relatively straightforward 
treatment of the eigen-problem and the particle algorithm. The eigen-quantities of interest exist 
and have analogous properties under much weaker assumptions, and an MET similar to Theorem 
[T]can be obtain ed for non-compact X in a weighted oo-norm setting u nder quite flexible Lyapunov 



drift conditions |Kontoviannis and MevnI . l2003llWhitelev et al.l . |2012| . The details, however, would 
necessitate a more complicated presentation. Obtaining finite N error bounds of the form (|14p - 
(jl5[) under assumptions much weaker than (H) seems challenging and is essentially linked to the 
well recognized difficulty of proving time-uniform convergence of particle filtering methods on non- 
compact spaces. 

The remainder of this paper is structured as follows. Section [3] deals with existence and characterization 
of eigen-quantities and the MET. Section 2] considers some deterministic functionals which give finite n 
approximations of the eigen-measure/function and the twisted kernel. Section [5] is devoted to various 
properties of the particle approximations, and the L,. error results are housed in Section [51 Section 
[7] contains three illustrative examples. These examples should be mostly accessible after completing 
Section [51 and the reader more interested in the applications may choose to jump forward and preview 
these examples before reading the other sections. Some possible extensions are discussed in Section [8] 



3 Multiplicative ergodicity 

We start with a largely self-contained account of properties of Q and implications of (H) which set up 
the eigen-problem of inte rest. Theorem [T] is the main result of the section, establishing the MET by 
using some results due to iNummelinI 2004| . The contents of this section are therefore mostly not new, 
but many of the details and techniques are directly relevant to the approximations studied in subsequent 
sections and are therefore included for purposes of exposition. 

We will take as in (H) as an irreducibility measure and define B'^ := {A G B; i^iA) > 0}, £+ := 
{(p G C; v{'p) > 0}. The semigroup {0(");n > 0} is called i/-irreducible and aperiodic, respectively, if 



Q(")(x, A) > 0, V.T eX,AeB+, and 



n=0 



lim inf I 

71— )-00 



Q(") {x, A)>o\ =1, for each xeX,AeB+. 



A function s E £+ and measure rj E V are called small (for Q) if for some m > 1, it holds that 
(5(™'(x-, A) > s{x)t]{A) for all X e X, A e B. For notational convenience define s~ : X ^ M_|_, s+ : X ^ 
M-i- by s~ (x) = e~,s^(x) — e+,Va;, respectively. Clearly under (H), s~ and v are small for Q and the 
semigroup {Q(");n > 0} is then j/-irreducible and aperiodic. 

We begin with a lemma that establishes uniform bounds on ratio functionals involving iterates of Q. 
Bounds of this form will appear throughout the paper. 
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Lemma 1. For any ij.' G V and Lp E 



mt mt , , ^ ^ > — > 0, sup sup , . , , 



< — < oo. 



(17) 



Proof. Under (H), 



Q^"^ (y) (y) ^ V ^ Y ^ 1 



then integrating in the numerator with respect to /i' and re-arranging gives the infimum bound in (|17p . 
The proof of the supremum bound is similar. □ 

We now turn to asymptotic properties of the iterates Q^"\ From the minorization part of (H) and 
SMper-additivity, the limit 



A^, := lim — logi^Q*"^s = sup — log i^Q'"^^ 

rn-oo n n>l n 



(18) 



exists. Define 



A* := cxp(A*), 



to be the generalized principal eigen-value (g.p.e.) of Q. Due to the irreducibility, A* does not depend on 
the pa rticular choice of minoriz i ng me asure i/ or small function function which appear in its definition. 



From 



Kontoviannis and MevnI . l2003l Theorem 3.1], we have that for any s E £+. 



oo, 



Vx e X, < A. 



< oo, for — almost all x £ X, <; > A^,. 
In the present context, the g.p.e. coincides with the reciprocal of the convergence parameter of 



Nummelin 



2004 Section 3.2]. Connecting with the terminology of the latter, the irreducible kernel Q is called A^- 



recurrent if for any s € Yl^=o ^* "Q*-"-* (s) (a;) = oo, u — a.e. 
The following lemma prepares for Theorem [T] 



Lemma 2. We have 

£- < e = A, < 
and therefore Q is A* -recurrent. 



inf inf > 0, 

u'eVn>o A" 



(19) 



Nummelin 



2004 p.96], under (H) the kernel 



Remark 1. Following the terminology and arguments of 
Q is then additionally uniformly A^^-recurrent. 

Proof, (of Lemma [2]). The upper and lower bounds on the spectral radius ^ follow from (H), because 
for any n > 1 and a; G X we have e~ < [Q'"^ (l)(x)] To verify that A^, coincides with ^, write 



1 sup,Q(")(l)(x) 



:/g(")(s-) 



1 sup.g(-)(i)(.)_i 
1 e+ 1 i/g("-i'(i) 1 

< - log — + - log , + - 



log e — > as n oo . 



It remains to verify the uniform lower bound in (jl9p and thus the A*-recurrence. A key feature of the 
majorization part of assumption (H) is that by su6-additivity we are assured of the existence of: 



A+ := lim ilogi/g(") (s+) = inf -logi^g^") (s+) 



n>i n 



(20) 
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But from the definitions of s+ and 



1 



■logz/Q(") (s 



1 



■ log lyQ 



(") 



log 



j/g("'(l) e+ 



- log — 



j/g(")(l) e- 



(21) 



so taking n — > oo we find that 
imply 



A*, and then (PT|) together with the right- most equality in (PD|) 



so 



-logi/Q^"' ,-)_A, >— log — , 



)(") 



Equation (fT9|) then holds as 



A''Q<"'(1) 



> 



i^Q<")(l)e+ - (I+p' 

Now consider the family of potential kernels, {Ue; £ [A^, oo)}, 



— ^ — - > — > 0. 

A" - e+ 



for all jj' Cz V , and this implies A*-recurrence. 



□ 



Ue:=J2e—'{Q- 



s ® V 



An) 



n=0 



where the convergence of the sum, in the operator nor m, is ensured by the A^,-recurr ence of Q and is 
straightforward to verify using the inversion argument of lKontoviannis and Mevnl |2003l Proof of Lemma 
3.2], noting that as per Lemma [51 the spectral r adius of Q coinci des with the g.p.e., ^ ~ A^. We now 



Nummelin 



state the MET. The proof uses various results of 
Theorem 1. The measure and function pair defined by 



2004|. 



satisfy 



V* 



'^Ux, (1)' 



£- e+ 

— < hJx) < — , Vx e X, 



(22) 



(23) 



and they are respectively the unique probability measure and v-essentially unique positive function satis- 
fying 

r]^Q = A*??*, Q {h^) = X^h^, ry* (K) = I. (24) 

Furthermore is ergodic with a unique invariant probability distribution, denoted by -K-^, such that 
dn^^/drii, — h^. For all n > 1, 



< 2r 



(25) 
(26) 



where p := 1 — (e /e"''). 



Proof. We have have seen that (H) i mplies irreducibility and aperiodicity and, as per Lemma [51 Q 
is A*— recurrent. By 



Nummelin . 



2004 Theorems 5.1 and 5.2], i^Ux^ and Ux^,{s ) are respectively the 
unique measure and i/-essentially unique non-zero function satisfying 

yUx^Q^KyUx^, QUxA^') ^ ^*UxAs'), vUx^{s-)^\. (27) 
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Under (H) we then have from ([77)) that 



< ^ = Y^Ux^ [s-) < C/a. {s-)ix) < ^i^C/a. {s-) = '—<oo, Wx, (28) 

which vahdates the normahzation in (P^ . thus estabhshing ([M|). The uniqueness properties transfer 
directly to rji, and /i*. 

We obtain from (P7)l and ([^5]) the fohowing uniform lower and upper bounds on h^,■. 

A* A* A* V*U\, [s ) A* V*U\, (s ) e+ 

/i^ x) = r < —ly {K) = — — — < — < oo, Va; (30 

A* A* A* Ty^t/A, (s ) e 

so that is established. Furthermore Pi, is then well-defined as a Markov kernel and we readily verify 
that it satisfies a uniform minorization condition: 

, Q{x,dx')K{x') 

P*[x,dx) = , , 

hi,{x)K 

^ v{h-i,) v[dx')h-i,{x') 

~ K{x)Xi, v{K) 



Ux, {■s-){x)X, 
e 



> —iy{dx')Ux^ (s-) {x'), Vx, 



where vUx^ (s~) = 1 and (|28p have been used. Thus P^, is uniformly geometrically ergodic and by inspec- 
tion of the eigen-measure equation its unique invariant probability distribution, de noted by tt^,, is given 
by TT-j r (v?) = ?7* (/i*v) /v* C^*) = 'y* (,hi,(p). Then, again noting that vUx^ (s^) = 1, by [Mevn and Tweedid . 



20091 . Theorem 16.2.4] we have: 

|||Pi")-l®^.l|| <2p", (31) 
where p := 1 — (e^/e+), which establishes ((25)) . Multiplying by /i^ > in ((3T|) yields for any (/> £ £,x E X, 

A7"Q(") (/!,</)) (x) - /i4a;)77, (/i,0)| < 2p"/i4a-) < 2p" (^^^ , (32) 

where ([50)) has been used. By equation (|^^ . /i^ is bounded below away from zero and therefore for any 
(fi £ £, we may have taken 4> := (p/h^, £ £ in (|5^ . Finally noting from (|^9)) that < (e+/e^) \\(p\\, 

the bound of ()26p is established. □ 



4 Deterministic approximations 

In this section we consider various deterministic quantities which will help us connect the algorithm 
with the eigen-quantities. The main result of this section is Proposition (TJ which, amongst other things, 
quantifies the error between hi, and some deterministic functions (/ip „;p < n) defined below. The interest 
in the latter functions is that they will serve as intermediate quantities between /i^ and {hp when we 
come to analyze the particle algorithm in Section |6| Various other important objects defined in this 
section appear throughout the remainder of the paper. 



10 



Define the probability measures (77„; n > 0) and numbers (A„; n > 0) by 

770 :=M, Vn ■■= ^Q{n)(^iy " ^ 1^ A„ 77„(G), n > 0. (33) 
Immediately from ([55]) we have the product formula: 

n— 1 n— 1 

^?pg^"-^Hi) = n = n (34) 

and we note that 

Vn = ^{Vn-l), n>l. (35) 

Straightforward manipulations show that under (H), for any 11 > I, r]n is equivalent to i'. 
Next define the sequence of non- negative functions (/ip.,i;0 < p < 71) as follows: 



Q("-P)(l)(x) 
77pQ("-p)(l) 



/in,n(a;) := 1, -rTv;;3;7r7TT' < p < 7T,,a; e X. (36) 



Remark 2. It should be noted that (rin), (A„) and (/ip,n) depend implicitly on the initial measure fi. 
Under assumption (H) we obtain uniform bounds on these quantities, as per the following Lemma. 

Lemma 3. 

inf77„(G)>0 (37) 

n>0 

inf inf inf hp,n{x) > — - > 0, sup sup sup/ip,„(x) < — < 00. (38) 

">10<P<"^eX e+ n>lQ<p<nx<£X ' £ 



Proof. Assumption (H) implies that G is bounded below away from zero and therefore we have p7p. 
Lemma [T] implies ([35]). □ 

Furthermore, we find that they obey recursive relationships which may be considered as mimicking 
the eigen- measure/function/value equations (j24p . 

Lemma 4. The probability measures {rjn), Junctions {hp^n) o,nd numbers (A„) satisfy 

VpQ = KVp+li Q{hp+l,n) = \hp.n, 7?p (/ip,n) = 1, Q < p < U. (39) 

Proof. The measure equation is just a rearrangement of psp . The function equation is due to the 
definition of (/ip,n) and the product formula ([M)) . as 

The final equality in holds due to the definition □ 
We now introduce the family of Markov kernels {P(p^n) ; 1 ^ P ^ defined by 



P{p,n)ix,dx') 



Ap_i/ip_i_„(x) 



The following proposition shows how (jy^), (/ip,ri), {P(p,n)) can play a role as approximations of 77^,, h-^, 
Pi,, respectively. 
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Proposition 1. For any < p < n, 



with 



\\vn-v4 < p'^c,, 



-h4 < p'^-^-^^'^^Ch. 
PAW < p("~^'^^pcp, 



(40) 
(41) 
(42) 



C, -.^ 4(e+/6-)' 

C, := 2{e+/e-f[l + {e+/e-)+2{e+/e-f 
Cp 2C^(e+/e-)' + C^p-i(e+/e-) 

having no dependence on the initial measure fi. 

Re mark 3. Exponential converge nce of the general form (|40p has already been established in, for exam- 
ple, iDel Moral and Doucetl |2004l | using Dobrushin arguments for a collection of inhomogeneous Markov 



kernels, but the rate obtained there is p 1 — (e^/e+) as opposed to p. The proof of Proposition[T]uses 
the MET bound of equation ((26)) and as may be seen in the proof of Theorem [U the rate p is inherited 
from the uniform geometric ergodicity of as per (j25p . This is the source of the improved rate. 



Proof, (of Proposition [T]) We first treat (gD]) , 



hn -'7*11 



< 



sup 

V:|V|<1 

sup 



mQ(") (^) 
/iQ(") (ip) 



fiQi-^) (1) X^fi{h,) 
AiQ(") (1) 



mQ^"^ if) 
X^p{K) 



< 



- sup 
v-\v\<i 



//g(") (1 
AiQ(") (^) 



Kp{K) 



\^p{K) 

- '7* {"P) 



< 4p" 



where the penultimate inequality follows from two applications of the bound of Theorem [1] Equation 
and the final inequality is due to ([2^. This establishes (P0|) . 
In order to prove (PT|) . we first consider products of the values (A„). We have 
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a: 



n—p 



- 1 



< 



Ar 



ApP 



+ hp(/^*)-'74MI 



1 + 2 



(43) 



where the penultimate inequahty is due to (pS)) of Theorem [T] and (PU)) . and the final inequality is due to 
Integrating and iterating the eigen-measure equation gives A" = r}i,Q^{l) . Then by Lemma 



Q(")(l)(x) ^ 
sup sup — < - 



With the above bounds in hand we now address (|4T|) . We have 



< 



Ar^ 
Ar*' 



1 



Q("-P)(l)(x) 



a:' 



1 



+ 



n"=p Af 

Q("-P)(l)(a;) 



sup sup ■ 

m>l 1/6X 



Q(™)(l)(y) 



Ar 



/i*(a;) 



= 2p' 



(n— p)Ap 



where for the final inequality, (|43)) . (|44)l and (|26|) have been used. This establishes (|4T 
For (j42D . consider the decomposition 



-P*||| !i sup sup 

X i^:|ip|<l 



1 



Ap_i/ip_i_„(a;) 
1 |/ip-i,„(a;) - K{x)\ \Q {Kip) {x)\ 



Ap-l hp^in{x) 

|A.-Ap_i| 1 



K{x) 
IQiKifi) {x)\ 



< Whr, 



hi,\\ sup 



Q{l){x) 



X -^p— l^p— l,n(^) 
«p-l,n - KW sup 



Ap-l 
|A. -Ap_i| 
Ap-l 



< c,,p("-p)^f2 ( — I + 



^p— l,n ('^) 



(44) 



where for the final equality, Lemma [31 the identities Ap = rjp{G), A* = 'i]i,{G), and (pn)) - (|¥T|) have been 
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used. □ 

5 The particle system and Monte Carlo approximations 

This section addresses various properties of the random quantities appearing in the algorithm. The 
main result of the section is Theorem ^ which addresses path-wise properties of some random operators 
which underlie the algorithm. Proposition [5] establishes a lack-of-bias property associated with these 
operators and may be of some independent interest as noted in the remarks thereafter. We begin with 
a probabilistic specification of the particle system. 

For > 1, the particle system in the forward part of the algorithm can be constructed as a canonical 
Markov chain with sample space ■= (X^)'^, endowed with the corresponding product cr-algebra, 
derived from the underlying d-algebra B. The state of the chain at time n > is the coordinate 
projection of cj G fiyv denoted by = . . . , C,^), and taking values in X^. The natural filtration is 
denoted by J>j = cr ((^Q; ' ' ' ? Cn), where the dependence of each C„ and J>j on N is suppressed from the 
notation. The law of the A'^-particle system is denoted by Pjv , and in integral form, the initial distribution 
and transition probabilities of (C„) are given respectively by 

N 

1=1 

where dxn is an infinitesimal neighbourhood of Xn = (x^, . . . x^) G X^. The expectation corresponding 
to Pat is denoted Ejy. 

The idea for the eigen-function approximation in the algorithm is to consider the identity 

/ip_i,„(a;) = [ Q{x,dy)hp^„ {y) 

-^p-i J 



Ap-i J dr], 
1 f dQ{x,-) 



Ap_i J d^{r]p-i) 
dQ{x, •) 



{y)hp,n{y)vp{dy) 



d{i]p-iQ) 



iy)hpAy)Vpidy), (46) 



where the first inequality is due to the definition of the functions (/ip,,i), the second inequality is just a 
change of measure in the integral, and the third and fourth equalities are due to rip{-) = $ {Vp-i) (') = 
^^{G) ^'^'^ definition Ap_i = 77p_i(G). For any x and p, the derivative ^ is well defined under 
(H) because Q{x, ■) is then equivalent to v for any a;, and then also equivalent to rjp. Loosely speaking, 
the backward recursion of the algorithm arises from taking the random measures (??^) in place of [rjp) 
in dini). 

To be more precise, let {Qn) be the collection of random integral kernels defined by 



g^(x,dx') := ^%i^(^')^^(d^'), n>l. (47) 
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It is convenient to recall the semigroup notation in this context: 



Now define 

A^:=r7^(G), n > 0, (48) 
and let (/i^„) be the collection of random functions defined by 



C„(x):=l, h^Jx) := < p < n. (49) 

Recall from Section [51 



■'^n-l"'n-l,2n\-'') 

The following lemma establishes relationships between these objects which may be considered stochastic 
counterparts of the relations of Lemma H] 

Lemma 5. The random measures (?/^), functions {hpn) ' '^^'^ kernels {Qn) satisfy 

VpQp+i=^pVp+i, Q^+i AXn: <(0=1' 0<P<n. (50) 

ji-i 

= n 0<p<n. (51) 

Proof. For the measure equation in ((50)) and the definitions (|47)) - (|48| . 



X^V^+Adx'). (52) 



By iterated application of ([5^ we have 



n-l 

e=p 

where the final equality is due to the convention Q^^ •= ^'d- This establishes ([5T|) . For the function 
equation in (|50p . we have 



(lN \ Qp.nW 



where the final inequality holds due to ([5T|) . The right- most equality in ((501) holds directly from the 
definition of . □ 

Remark 4. The recursion in the "backward" part of the algorithm is a re-arrangement of the middle 
equation in ([50| . 
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5.1 Lack of bias 



Proposition 2. Let fi' E V and let fi^ be an J- q -measurable random measure satisfying Eat ^fi^ (A)] 
fi' (A) for all A Cz B. Then for any tp Cz C, N > 1 and n > 

Eat [/^^Q^„ (^)] = A^'g(") {^) ■ 
Proof. The n = case is trivial. For any (ys G £, n > 1 and x G X, we liave 



E 



i=l 

rfQ {x, •) 



(x, 



{x')^{x')^:i {dx') 



N 



iCn) V (C) 



Q((^) (x), 



(53) 



where tire penultimate equality is due to the definition of the particle transition probabilities (|45p . 
Now consider the telescoping decomposition 



71-1 



p=0 

For each term in the big summation we have 



E 



N 



J~ n 



= 0, 



T 

J 7) 



where the final equality is due to ([5^ . For the remaining term, E^r [(/i^ — /^') Q*-"^ (</?)] = by assump- 
tion of the proposition. □ 

Remark 5. We highlight two interesting instances of initial measures in Proposition [21 The first is the 
degenerate case in which /i^ = /i', for some fi' E V other than fi: in this case we note that there is no 
bias (in the sense that the Proposition [5] holds) when the functional fi^Qg^ {(p) involves a deterministic 
initial measure, other than that used to initialize the particle system. The second case is that in which 
fi' ~ fi and ji^ = tiq . In this case we have 

V^Qti^) = V^iG)l lv^idx,)^^^{x,)Q^^A^){^i)vndxi) 



VoiG) Voidxo) 



1 ^ 



rlAv) (xi)<(dxi) 



1=1 

%^(G) j QlA^) (xi),yf (dxi) 

p=0 
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where the final equahty can be verified by a simple induction. So in this case, we recover from Proposition 



Hthe equality En [U^Z^ v!!(G)v^ (^) 



particle algorithm [Del Mora] 



2004 



fiQ^""^ {(p), which is well known for the "forward" part of the 
Chapter 9]. 

Remark 6. A number of generalizations of Proposition [5] may be obtained quite directly. Consider some 
integral kernel Q different from Q and which, for simplicity, satisfies Q{x, •) ^ Q{x, ■) for all x. Then 
defining 

rf* [Vn-l) 

one can establish by similar arguments to those in the proof of Proposition [5] that 



E 



N 



i.e. that the particle system defining (?/^) and whose law involves Q can be used to obtain unbiased 
estimates of product formulae involving Q. In turn, this might be of interest both in the present context 
and in other applications of particle systems, when the aim is to approximate ratios of the form 

/i'Q^") (1) 
AiQ(") (1) ' 

although further details are beyond the scope of the present work. The time-homogeneity can also easily 
be relaxed, of course under appropriate domination assumptions. 

Remark 7. A related lack of bias result is treated in Section 17.31 Proposition [71 in the context of a 
sampling problem involving the twisted kernel approximations {P^n} ■ 

5.2 Path-wise behaviour of the random operators 

Under assumption (H), we find that the random operators satisfy path-wise, a regularity condition of a 
similar form. 

Lemma 6. The operators (Q^) satisfy 

(•)£" < Qn •) < e+«n (■), Vx e X, n > 1, iV > 1, 
where is the random finite measure: 



N I 



dv 



-ix) 



and e , e+ are the deterministic constants in assumption (H). 



Proof. Since Q{x, •) is equivalent to z/, then $ (Vn-i) too, and it is straightforward to check that 
assun 
have 



assumption (H) implies that ^^^'^n — ^i^) bounded above and below away from zero in x. We then 



dv 



Ad^ (77^, 

A d<^ (ry4^_i 

dv 



-{x')Vn{dx') 



A d^ 



i^^nidx'), 



17 



The proof of the lower bound is similar. 



□ 



The following proposition provides a generic result on iterates of non-negative kernels, which will 
serve multiple purposes throughout the remainder of the paper. 

Proposition 3. Let (/i„;n > 1) 6e a collection of possibly random, non-negative integral kernels, and 
suppose that for a collection of possibly random, finite measures > 1) and positive, bounded func- 

tions {S^,S^;n > I), 

S;;{x)i^n{-) <Kn{x,-) <S+{x)iyn{-), VxeX,n>l. (54) 



The 



where 



Sn sup 



5+ (a 



Furthermore, for any possibly random probability measure j] and 



sup 

x<£X 



Ko,ni'P){x) Ko^n{l){x) r]Ko,n{'P) 



< \\ip\\2CsY[pp 



(55) 



where p„ 1 - ( inf^gx |t|fy 



2d Cs := sup„>i Sn- 



Rema rk 8. We approach the proof of this proposition using a decomposition idea of lKleptsvna and Veretennikov 



2008| , a technique which they demonstrated to be useful in the analysis of non-linear filter stability 



on non-compact state-spaces. We won't exploit the full genera lity of this kind o f decomposition (it is 



useful under conditions much weaker than (H) - see for example |Douc et al.l . l2009l | . again in the filtering 
context) and we choose to take this approach because it yields a short and direct proof, which is sufhcient 
for our purposes. 

Proof, (of Proposition [3]) . The uniform bound of ([55]) holds directly under the assumptions of the 
proposition. 

We write K^^ {x,y,d{x' ,y')) Kn{x,dx') Kniy,dy') and v®'^ {d{x,y)) := Vn{dx)vn{dy) . Under 
the assumptions of the proposition we have for any {x,y) E and measurable A C such that 



K„ ix,y,A) 



< 



KS' {x,y,A)^S7Ax)Sniy)'yr (A) 



Sn i^)Sn (y) 



S+ix)S+{y) 
< pnK^'{x,y,A). 



(56) 
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Furthermore, 



77i^o,n(l)?yA'o,„(l) 
A'o,„(l)(x) \iS^ (E> v) Kfl ((^ ® 1 - 1 ® ^)| 



{5x <E) -q) Ko^n {(fi (E) I - I <E) if ) 



< 211^11 

< 211^11 



sup S'j 
sup Sf 



{S^ (8) ?7) Xo,„ (1 (8) 1) 

n 

p=i 



(57) 



where the equah tv in (1571) is due t o the decomposition technique of iKleptsvna and Veretennikov 
p. 422] [see also 



Douc et al 



2008 



2009L Proof of Proposition 12], and for the final two inequalities (|55p and 



have been used. 

The first place in which we apply Proposition [3] is in the proof of the following theorem. 



□ 



Theorem 2. The following path-wise, uniform bounds hold for the random operators (Q^) and the 
corresponding non- linear semigroup. For any n > 1 and ip C, 



sup sup sup 

fi'ev N>i weOjv 




M<2|1^I1p" 



(58) 



sup sup sup 

ii'ev N>i uefiN 



t^'QL (^) 



where p = 1 — (e /e^)^. 
Proof. From Lemma [5l 



p=0 



N 



(a.) < 211^11 



o^„(l) 



'/O '^0,n 



(59) 



(60) 



Thus ([55]) holds due to Lemma [5] and Proposition [3] applied with 77 = ?7q , /\„ = (5„, = Q!„ and 
5+ = e+, 5^ = are constant. Dividing through by (ft.^„) in ([55)) . again noting ([SD]) and using 



sup sup ^ 
which also holds by Proposition [31 we establish ([5^ . 



< 



(61) 
□ 



6 Lr error estimates 

This section deals with establishing L,. error estimates which validate the algorithm; Theorem [3] is the 
main result of the section. A core component in the proof of that theorem is Proposition [Sj which is 
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stated after the introduction of some further notation and Lemma [T] 

Consider the collection of "backward" random kernels (i?^) defined by 



r>N r J /\ N /] n dQ{x',-) 

[Vn-l) 



and with a slight abuse of convention, write 



R 



N 

■p+1' 



p < n. 



The interest in these quantities is that, in the context of the Lr error estimates which are the focus of 
this section, they provide a convenient way to express the functions (/i^„) and share path-wise stability 
properties with [Q^) ■ Indeed by a simple induction it can be shown that for any ip £ C, 



(62) 



Remark 9. Each kernel i?„ is equal, up to a scaling factor of i]^^_^{G), to a certain "backward" Markov 
kernel used in the analysis of iDel Moral et al. I |20inl |. In contrast to the latter work, we are centrally 
concerned with emphasizing the relationship between (Q^„) and the underlying semigroup (Q(")). In 
view of ([5^ and Proposition [21 we therefore prefer to deal with [R^), but only for cosmetic reasons. 

Given the close relationship between (Rn) and {Qn) i it is perhaps not too surprising that the former 
satisfy a condition similar to that in Lemma [51 

Lemma 7. The operators (i?^) satisfy 

€-i{-)Pn{^)^~ < Rnix, ■) < e+/3^(x)77,li(-), Vx e X,n > l,iV > 1, 
where (3^ is the random, positive and bounded function: 



dv 



-ix) 



and e , e+ are the deterministic constants in assumption (H). 
Proof. From definitions. 



^^(x)77„_i(da; ) 



A d^ iv':^,) 

dQ[x' , •) dv 



{x)rj:!_,{dx') 



dv 



d^{v^-,) 



The claimed positivity and boundedness of /3„ follows from (H). The proof of the lower bound is 



similar. 



□ 



It is well known that under (H) and variations thereof, one can obtain time-unif orm Lr est i mates 
for errors of the form rj^ (ip) — rjn (<p). We will make use of the following result, due to lDel Morall 2004 . 
Theorem 7.4.4]. The proof is omitted. 



Proposition 4. For any r > 1 there exists a universal constant Br such that for any ip C, the following 
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time uniform estimate holds 



sup Eat \t]^ (ip) -iin{^)\ 

n>0 



^1'' „ Br /e^ 



We need a further definition. Consider now the functions ((/>„) and their random counterparts 
defined by 

dQ {x, ■) 



and note that under (H), 



dQ [X, •) , jY , 

dTJnQ 



{x') , n > 



sup sup 1 0„ (a;, a;') I < — , sup sup sup | (x, < — . 

n>Ox,x' e N>ln>Ox,x' £ 



(63) 



Furthermore, we then Irave from definitions that 



1 



dQ{x,-) ^AT ?N AT / , ?N 



(64) 



'In ^n,p+l \^) 

where the final equality is due to (p^ . 

Proposition 5. For any r > 1 there exists a universal constant Br such that for any ip € C and N > 1, 

r-il/r 



sup sup Eat 

p<n x^X 



where 



C = 



1-/5 



and p is as in Theorem\^ 

Remark 10. The treatment of the term T^^ in the proof uses some arguments from 
2010 . Proof of Theorem 3.2], with variations customized to the present context. 



Del Moral et al. 



Proof, (of Proposition 5) From the identities 

{phi:) jx) _ r^^ {x, ■) Q^„(l)] _ 7?jr<, (x, •)] 

-^^1 VpQp.ni^) Vn^n^pi^) 

(established similarly to equation (|64p ) and 

Qi^K^n), , _ Vp[pcl>p-l{x,-)Q'-"~P\l)] 



-{x) = 



,7pQ(»-p)(l) 



we have the decomposition 
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where 



(1) 



For the difference in (|65p . under (H) we have 

1 



77pQ("-p)(l) 



< 



< 



v^Kp (1) 

^^<p (1) 

ll^ll 



/ Vp-i{dy)q {y, x') J 7]p-i{dy)q (y, x') 



{e~f V^K,p (1) 



and therefore by Proposition H] and q{y,x') < e"' 



Jqiy, x') Vp-iidy) J g {y, x') Vp-iidy) 

q{y,x') [iip^iidy) ~ 7]^_i{dy)] 
q{y,x') [r]p^i{dy) - iip_i{dy)] 



V^^R^pidx') 
V^^RZpidx') 



E 



TV 



1/r 



Br 



iT-wrj < 2 11.11 ^ p 



For the difference in due to the relation 

Vp-i{dx)Q {x, dx') = $ (77f_i) (dx') i?;^ (x', dx) 
we have the telescoping decomposition 



E 

rn—p 



[g("-'")(i)i?,^,, [v:'<^p-i (x, .)]] $ [Q("-™)(i)i?,^,, [^</.p-i {x, •)]] 



,y^[Q("-™)(i)i?^^^(l)] 



Each term in the summation (|69p is of the form 



where 



r,^[Q("-™)i?^,,(l)] 

'i'O?^-!) [Q("-™)(l)i?^.p(l)] 

* (^yil-i) [Q("-™Hi)i?,^,p (1)] * {v^.-i) (1)] 
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rj TQ'-""'"^ (1)1^1 

Defining the map ^f^^^ -.V by „(?7)(^) := , for Ae B,we have 

riQy"^"'-' [1) 



sup 



< sup 



Q(»-rn)(l)(^) 

7<i>(^„^-i) [Q(— )(i)] 



X sup 

x,y 



RZ,, b0p-i •)] (2/) 



<p (!)(?/) 



['i' ('7,^-l)] [^^,p (1)] ^m,n [$ [i?,^,, (1)] ^m.n [$ [^^,p (1)] 



where the inequaUty is due to Lemma [U the bound of and then Lemma [7] and Proposition [3] apphed 
to the sequence of kernels R^, . . . , with = 4'm,n (''?m-i)] i ^nd p is as in Theorem [2j 

Then returni ng to (IM1)-([7D1). a nd noting that A^^Xmiy) is measurable w.r.t. to J-m-i, we have by an 



application of iDel Moral 



2004. Lemma 7.3.3.1 



E 



N 



1 1/^ 



Br /e 



5 n 



Br (t 



N \t- J yjN \e- J l-p 

ni—p V \ / 



(71) 



where the bound of Proposition [3] in equation ([55)) has been applied to the left factor in (jTO)) . 
It remains to consider T^^{x), and we do so using the decomposition: 



-^p,n (■^)| 



* [^<l>P-i ■) Q'""^^ (1)] Tip W^v-i ix, •) Q(""P) (1)] 



< lly'll 



$ «_i) Q("-fHi) %Q("-fHi) 



ll^il 



r;p_iQ("-P+i)(l) 
Now note that due to Lemma [T] and the bound of 

Q[(/)p_i(x,.)Q("-p)(i)] iy) 



(72) 



sup 



77p_iQ(«-P+i)(l) 



< sup |0p_i(a;, a; )| sup 



Q(«-P+I)(l)(y) 



< 



(73) 



and the same bound holds with ^ in place of ijp-i- Then Proposition |4] combined with (j73p may be 
applied to each of the terms in ([7^ to yield: 



E 



N 



'N \e- 

Combining ([BS]). ([7T|) and ([7^ completes the proof. 

Theorem 3. For any r > I there is a universal constant Br such that for any n > 1 and N > 1, 



sup Eat 

x<£X 



(74) 

□ 

(75) 
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sup sup Eat 
xeXAeB 



P[n,2n) ^) ~ ^* ^) 



< 4^C— + Cpp" 



(76) 



where C is as in Proposition\^ and p, Ch, Cp are as in Proposition]^ 
Proof. Consider the decomposition 



n+1 (\H-l,2n) i^) Q (^ri+l,2n) (x) 



(77) 



The first difference on the r.h.s. of (j77p is dealt with using Proposition [5] applied with = 1. For the 
other difference, we have that by Proposition (TJ 



sup sup \hn.2n{x) - K(x)\ < Chp" . 
n>Q xGX 



To prove (|76p . consider the decomposition: 



where 



P, 



(x, A) - p. (x, A) = E'l.ix, A) + E'l.ix, A) + E„,,ix, A) 



(78) 



^n,l{x, A) 



QniKanWix) Q{K,2nlA){x) 



Q{hn.2nlA){x) 



K-1 



hn-l,2ni.x) K-l,2n{x) 



^n,3{x,A) := Pl^n,2n){x,A) - Pi,{x,A). 



(79) 

(80) 
(81) 



For the first term, 



E 



N 



1/r 



< — E 

e 



< 2^1^C, 



QniK2jA)ix) QiKanWix) 



where the first inequality uses a lower bound derived from Lemma [SI 



''■n-l,2n(^) 



Q^1.2J1)W 
''ln-lQn~l,27ii^) 



> 



and the second inequality is due to Proposition [S] applied with ip ^ Ia- 
We also have 



E 



TV 



^n.2ix, A)\ 



^ e+ Qihn,2nlA){x) ^ 
e~ Xn~lhn-l,2n{x) 

e+ Br 



l/r 



\K-l,2n{x) - /l„_i.2„(a^) 



l/r 



< 2- 



e- VN 



C, 



(82) 



where for the first inequality (|82p has been used and the second inequality is due Lemma|4]and Proposition 
[S] applied with ip ~ I. 

The term S„_3 is dealt with using Proposition [T] and that completes the proof. □ 
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7 Examples 



7.1 A model of neutron transport 

The purpose of this first example is to illustrate how the eig en-quantities a rise in a discrete time branching 



process model for one-dimensional neutron transport - see [Harrisl . 119631 Chapter III] for details. In this 
case X = [0, L] for some L < oo and the kernel Q is the conditional first moment measure of the branching 
process. Our objective is to compute the eigen-quantities, which have certain probabilistic interpretations 
in this setting. 

Let J/a : X — > be a B-measurable function. Given that the current generation contains a neutron 
located at a; G X, that neutron evolves in a conditionally independent manner as follows. With probability 
1 — cxTp[^Us{x)], the neutron is absorbed and has no ancestors in the next generation. Otherwise, 
the neutron moves left or right with equal probability, by an exponentially distributed distance with 
parameter c. If the neutron then lands outside X it has no ancestors in the next generation. If it lands at 
a location on X (modelling the occurrence of fission) two co-located neutrons enter the next generation 
at that location. The conditional first moment measure is given by 

exp[—Us{x)] / ccxp {~c\y " x\) dy, A^BjXdX, 

J A 

where dy is Lebesgue measure. For 6 > we consider the double-well absorption potential given by 

Us{x) := 6 [l0-^{2x/3 - O.bf - 300(2x/3 - 0.5)"^ + 24(2a;/3 - 0.5)^ + 28/5(2a-/3 - 0.5)] . 
We take the potential function G and Markov kernel M defining the kernel Q as: 



Gix) := e-^^(") 



M{x,dy) 



(83) 



It is straightforward to check that (H ) is then satis fied. In the case that 5 = 0, there is no absorption 
and we recover exactly the model of iHarrisI |l963l . Section III, 8.2]. In the case that A^, < 1 (resp. 



= 1, A^ > 1) the model is called sub-critical (resp. critical, super-critical). 



7.1.1 Interpretation of /)* and 

The re-scaled kernel A^^Q may be interpreted as corresponding to a new branching model with a ab- 
sorption/fission mechanism which is adjusted so as to be critical. Then writing Ej; for expectation with 
respect to the law of this new branching process initialized from a single neutron located at x, and writ- 
ing Ain for the total number of neutrons in the nth generation, the eigen- function h^, has the following 
interpretation: 

[Mn] = A-"Q(")(1)(x) — > h^{x), as 71 ^ oo,Vx e X, 

where the first equality follows from the definition of the conditional moment measure and the convergence 
is just an implication of Theorem [1] equation (pS)) . 

If we replace G with by G{x) := G{x)/2, we recover an absorbing Markov chain (i.e. fission-less) with 
a soft obstacle on [0,L] and a hard obstacle outside of [0,L]. For full details of this sort of absorbing 



Del Moral and Doucet 



200J]. Writing {Xn;n > 0) for the absorbing chain. 



Markov chain model see 

for its law initialized from x, T for the absorption time and Q{x,dy) :~ G{x)M{x,dy), we have the 
interpretation 

P.(r>n) = Q'"+'Vl)(a;), n > 0, 
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Figure 1: One-dimensional neutron model. Left pane: dashed, exp [— Vi(a;)]; solid, approximations of /i^,, 
o,S — 0; X , (5 = 0.5; □, (5 = 1; *, (5 = 2; o, (5 = 5. Right pane: sampled trajectories from the twisted kernel 
approximations, top (5 = 5, bottom S = 0.5. 



and thus by the Markov property. 



e A|T>n) 



M 



M 



A€B. 



(84) 



By very similar arguments to those used in the proof of Proposition [T] one may then deduce 



e A\T>n) = 



A.Q("+i)(l)(x)/A? 



ri+l 



Pi,{x,A), as n-> ooyx eX,A e B, 



where the first equality is just obtained from (|84)) by suitable multiplications. Thus the twisted kernel 
specifies the transition probabilities of the process conditional on long-run survival. 



7.1.2 Numerics 

We set L — tt/2 and c = 1. The algorithm was run with N = 250 and n = 1000 for various values of 6. 
In the current and the subsequent examples we will set the initial distribution fi as a Dirac mass at 0. 
The approximations of h^, shown on the left of Figure [T] were obtained from the algorithm by evaluating 
the window-averaged quantities 

m — 1 

-E^»+p.2J^) (85) 

for m = 100 and x on a grid of 150 points evenly spaced on [0, L]. One can straightforwardly generalize 
Theorem [3] to treat estimates of the form ([55)) . for example when m is made to be a slowly increasing 
function of n. The influence of the parameter S is apparent in the figure. When 6 = the eigen-function 
is given by hi,{x) cx sin(x) + cos (x) and estimates obtained from several runs of the algorithm were 
indistinguishable from the truth on the scale of this figure. 

The right pane of Figure [1] shows trajectories of length 1000 obtained by sampling from P^2n (Oi ')> 
then P^j^^i 2n 6tc. (for more discussion of this kind of sampling see Section [775)) . The top plot {5 = 5) 
exhibits occasional "switching" across the wells of the potential function, with in the bottom plot {6 = 0.5) 
the switching occurring more frequently. 
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7.2 Solving a Bellman equation 



In this example the eigen-quantities arise from the solution of a particular optimality equation. Denote 
"H := {/i ; X ^ M+; sup^ y h{x)/h{y) < oo}. For any h E H, denote the corresponding "twisting" of M 
by: 

M{x,dy)h{y) 



Mh{x,dy) : 
and denote the Kullback Leibler divergence 

KL{ m4m) (x) := 



M{h)ix) 



i^r I J M dMhix,-) 

For some [/ e £ our objective is to compute a solution (14,^^,) of the Bellman average-cost optimality 
equation: 

(86) 



V^x) + ^^= inf [Uix) + KL A/) (x) + Kh (K) {x)] , 

where is the optimal value function and t^^, is the infinite horizon optimal average cost, see 



Hernandez-Lerma 



1983 . Chapter 3] for background. A thorough treatment of the underlying control problem is beyond the 



scope this article, but we note that this is an infinite horizon counter-part of certain problems studied by 



Albertini and Runggaldier 



1988 . 



Mitter and Newton 



20031 



amongst others. The interpretation of 



is that M specifies the "natural" or control free dynamics of the state of some stochastic system, ft, is a 
control function and the controlled state evolves according to the dynamics specified by Mh- The term 
U{x) expresses a state dependent stage cost and the KL (-Af;i|| M) (x) penalizes the discrepancy between 
Mh{x, •) and M{x, ■). To connect with our general setup, we take 



G{x) := exp [-Uix)] 



(87) 



and then as usual 



Qix,dy) := G{x)M{x,dy). 



7.2.1 Interpretation of /i* and 

Proposition 6. The Bellman equation Ii86\} is satisfied with 



V^{x) = -\oghi,{x), 
= - log A* 

where A,t, are the principal eigen-pair corresponding to Q given by Furthermore the infimum 

in H86]} is achieved by taking h ~ hi, and the corresponding controlled dynamics evolve according to 



(89) 



Proof. We first note for any h € H and a function V such that the integrals are well-defined, by Jensen's 
inequality we trivially have 



log / Mhix,dy) 



exp(-y(y)) 

Hy) 



Mh{x,dy) log 



exp(-V(y)) 



h{y) 



> 0. 



(90) 



Re- arranging (|90p . using the definition of Mh, adding U{x) and on both sides and taking the infimum 
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we obtain 



- logG(a;) - logM (e-^) (x) < inf {U{x) + Ah{\ogh){x) - \ogM{h){x) + MkiV){x)] (91) 
^ ' hen 

= inf {U{x) + KL (M,,|| M) (x) + M,,(F)(x)} . 

Thus a solution of the Bellman equation (j86p is found if equality in (pij) can be achieved by some V = V^ 
and which satisfy 

K(a;) = -logG(x) -logM(e-^*) [x). (92) 

Now, equality in ((90|) and thus also in ((9T|) is achieved when h = e^^ and ((92)) may be regarded as a 
re- writing of the eigen- function equation for Q when i;^, = — log A*. Therefore choosing h = h^,, setting 
14 = — log /i^ and substituting into (|?T|). we find that ([55)) is satisfied. For (|M)) we just notice that 

□ 

7.2.2 Numerics 

The Cox-Ingersoll-Ross process satisfies 

dXt = 0{ji- Xt) dt + as/xldWi, 

where {Wt} is standard one-dimensional Brownian motion, 6* > is the reversion rate, /i > is the 
level of mean reversion and cr > specifies the volatility. In financial applications this process is widely 
used to model interest rates. When 29^ > it is stationary. Here X = and for purposes of 
illustration we consider the case that M is the transition probab ility from time t = to t = 0.01 of the 



CIR process, which is available in closed form |Cox et al.l . Il985| . Although known to satisfy a type of 



multipli cative Lyapunov drift condition which allows an MET to be established in a weighted oo-norm 



setting [Whitelev et al.l . l2012l |. M cannot satisfy (H). Truncation (and suitable re- normalization) of M 
to any bounded interval of X does allow (H) to be satisfied. In our numerical experiments this truncation 
was made to [0, 500]. We took the parameter settings 6* = 2, cr = 20, = 10 and considered, for a range 
of 5, the following "well-shaped" cost function: 

U{x) = 2I[oao-5](a;) + I[io+5,oo) (a;), (93) 

which penalises states outside (10 — 5, 10 -I- (5). 

Figure [5| shows estimates of the value function, obtained via averaging as per ([55| with iV = 500, 
n = 2000 and to = 1000 and evaluations on a fine grid from a; = 4 to a; = 20. Note the coincidence 
of the discontinuities in (|93p with those in the estimated function. The influence of the parameter E is 
apparent. Table [T] shows the empirical relative variance (variance over the square of the mean) of the 
estimated value function evaluations at different points x and for different numbers of particles A'^. The 
variance evidently decreases with N ^ with large values associated with more extreme values of x. 

7.3 Importance Sampling for Rare Events 

In this example the eigen-quantities arise from a rare-event estimation problem. Let (A„;n > 0) be a 
Markov chain with transition kernel M initialized from some Xq — x, and write P^; for its law and E^, 
for the corresponding expectation . For [/ : X — > [— 1, 1] a measurable function which is not constant 
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Figure 2: Estimated optimal value function Vi,{x) against x for various parameter values: o, (5 = 5; x , (5 = 
4;n,5 = 3;+,(5 = 2. 



N 


X 


6 


8 


10 


12 


14 


16 


50 


1.81 X 10"^ 


1.94 X 10-^ 


5.62 X 10-^ 


7.27 X IQ-'^ 


1.07 X 10-^ 


7.2 X 10-^ 


100 


1.02 X 10-^ 


9.13 X lO-t* 


2.78 X 10^^ 


3.26 X 10-5 


5.41 X 10-* 


6.15 X 10-'^ 


500 


1.15 X 10--* 


4.95 X 10-*^ 


1.46 X 10-*^ 


5.75 X lG-« 


3.08 X 10-'5 


2.28 X 10-^ 



Table 1: Empirical relative variance of value function evaluations (at different x), with n ~ 2000 from 
500 independent realizations of the algorithm. 



ly — a.e., some S G (0, 1) and m > 1, our objective is to estimate the deviation probability 



TTrniS) := P. 



J2 U (Xp) > mS 



(94) 



Th ere is a quite extensive literature on methods for estimating probabilities of the form (|94|) [see for exam- 
ple I Buckle w et al.l . Il990l iDupuis and Wang j_ 2005 L building upon large deviat i on the ory for functionals 



Iscoe et al 



1985 



Nev and Nummelinl . 11987^ being particularly 



of Markov chains, with the results in 
relevant in the pres ent context. We will explore an importance sampling scenario in the setting of 



Bucklew et al 



1990j. The choice of this setup and specific form of 7r,„ [S) provides some insight into the 
applicability of the proposed algorithm, but many of the details could easily be generalized. 
For a € M we will consider 



G„(a;) := e"^(^\ 



(x, dx'):= Ga {x)M (x, dx') , 



and note that when (H) holds for Qa, the Markov kernel M is uniformly recurrent. 

For some Markov kernel M which dominates Af, and i > 1, the importance sampling estimator of 

Hra{5) is 

1 ^ 



i=l 



.p=l 



{Xq, ■■■,X'„^) 



(95) 



where { [Xq, XI, ;i = l,...,L}is composed by L independent Markov chains, each with transition 
kernel M and obeying a law denoted by P^.. The corresponding expectation will be denoted below by 
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Ej.. Note that the dependence of 7f,„ {S, L) on Af is suppressed. 
7.3.1 Interpretation of h^, and 

We will denote by h", A^(q;), ri",P" the eigen-quantities and twisted kernel corresponding to Qa, and 
we emphasize that from Theorem [TJ 



A* (a) ^ lim - logE^; 

n— ^oo Jl 



cxp U^[/(Xp) 

\ P=0 J 



The convex dual of A* (a) is 



l(t) :~ sup [ta — Ai,{a)] , t G 



(96) 



Following 



Bucklew et al 



19901 Definition 2.] we will consider a class of Markov transitions which define 
the importance sampling simulation distribution. 

Let C be the collection of Markov transitions M for each of which there exists < e~ , e"*" < oo and a 
probability measure D such that 

(C) D{-)e- <M{x,-) <e^i'i-),yx, v^v, 



dv 
dv 



(.t) ) V (dx) < oo, 



where is as in (H). 

The following result describes the asymptotic (m — oo) behaviour of the probability of interest and 
the variance of importance sampling estimators. 



Theorem 4. 



Bucklew et ali . \l99(Jl] 



1. I{t) is a non- negative, strictly convex function with I{t) = if and only li t = A^(0). 

2. For any S E (0, 1), the following large deviation principle holds 



lim — log tt™ ((5) = - inf I{t). 

m-¥oo m te[(5,oo) 



3. For any S G (0, 1) and M in the class C, the importance sampling estimator satisfies 

1 — r 9i 
lim — logE;^; 5f„((5, 1) > -2 inf I{t). 

m-i-oo ml i te[5,oo) 



(97) 



4. For any S e (0, 1) and a the unique solution of A^ (a) = S, the twisted kernel is the unique 
member of the class C for which equality holds in (|97p , and as such is called asymptotically efficient. 

The following elementary corollary summarizes an important practical implication of this theorem. As 
the quantities 7r,„ (S) typically vanish as m — > oo, it is perhaps not surprising that L needs to be scaled 
up with m in order to obtain a reliable estimate, but the twisted kernel with a solving A^ (a) = S has a 
special role. 

Corollary 1. Assume inf^gj^ oo) I{t) 7^ 0. The number of samples L must increase at a strictly positive 
exponential rate in m in order to control the relative variance 



7r,„((5) 



1 
L 



TTm [5, ly 

7r™(5)2 
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if an only if M chosen from C is not the asymptotically efficient P" . Note that 7f,„ {S,L) is unbiased, so 
198\} is indeed the relative variance. 



Proof , (of Theorem 0]) We just point to the appropriate references. Parts 1. -3. are due to Bucklew et al 



1990l Theorem 1 and Corol lary 1], in turn derived from various results of 



Bucklew et al 



Iscoe et al. 



igsj Equation 



199CI is satisfied triv ially in the present scenario since I{t) is continuous. Part 



Bucklew et al. 



1990 . Theorem 3]. We note that the authors there consider the 



(9) in 

4. is an application of 

kernel M {x, dy) Ga (y), as opposed to Ga {x) M {x, dy), this difference is of no consequence due to the 
asymptotic (m — > oo) nature of the results and the fact that the two corresponding twisted kernels are 
essentially identical. □ 



In the case of choosing Al = P" for simulation, 



becomes 



i=l 



E 

.p=i 



U (X;) > mS 



exp[mA.(a)] (X^) 



(99) 



As per the following proposition, it is in fact possible to achieve unbiased estimates using the twisted 
kernel approximations to define a conditional simulation distribution, and using an estimator which 
mimics the form of (j99p . 

Proposition 7. Fix aG R, iV>l,7i>l, m<n and x G X arbitrarily. Conditional on T2n, le^t 
(^Xp;p ~ 0,...,to) be a non-homogeneous Markov chain with transitions 



Xq = X, 



X„ 



^(n+p,2n) ■) , P > 1, 



(100) 



where ^^(^+p 2ri)) ^''^ obtained from the algorithm with with Q = Qa- Then for any 5 £ (0, 1) and with x 
taken in the definition of TT,n{S), writing E^r for expectation w.r.t. the joint law of {Xpj and the particle 
system, 



E 



N 



Y.U{Xp)> 
.p=l 



\N 



= 7r,„((5). 



(101) 



In these displays, the dependence of , h^^, P(^^p 2n) o,n-d^N on a is suppressed. 

Remark 11. A detailed study of the variance associated with the estimator in (|10ip is desirable, but 
beyond the scope of the present work. However, we do know that for any fixed to, n and this variance 
is finite, since (H) holding for Qa implies infa; Ga{x) > and by very similar arguments to those used to 
establish equation (|6ip in the proof of Theorem [51 it can be checked that the other quantities appearing 
inside the expectation (jlOip are uniformly bounded above and below away from zero. We also provide 
some numerical results below. 

Remark 12. The same result holds with (|100p and (|10ip generalized to the case where (A"p) is initialized 
at any time point in {0, 2n}. 

Proof, (of Proposition [7]) For simplicity of presentation, we will write 

m 

F {xi..,n) :=! ^U{xp)>mS 
.p=i 
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Then from (|100p and the definition of P^^^^p 2n) ' ^^^^ G X, 



E 



AT 



= E 



E 



m-l \N l,N fv \ 



■AT 



= E 



■AT 



E 



N 



■ rn ^ 

~ m 



W(f> ^iJV \ '-•^P'' 'In+p \""^P) 
V 'n+p-1/ 



(102) 



where J-2„ is the cr-algebra generated by the particle system at time 2n. We will proceed to decompose 
the difference between (|102p and 7r„i ((5). 
For i = 1, ...,m, define Fi by 

F,n{xi;,n) ■= F {Xl:,n) , F^ (xi-j) := / (a;i:f +i ) Af (cCf , dx^+i ) , ^=f,...,m-l, 

Jx 



and observe that then 

M (Fi) (a;) E, [F = (5) . 

For any ^ = 0, m, and xq G X , define 

"'X^ p=l l^n+p-lj 

Then for any £ = 2, ...,m, 
Ejv 



(103) 



i-^p) Vn+p {dxp) , £—!,.. 



(104) 



Fi {xo) \ Fn+e-i 



n 



^ Y [Xp) Vn+p [dxp) En 

'n+p-l) 



X ('7„+^-i) 



dM (xp^i,-) ^ 



n 



i^p) Vn+p (dXp) / Fi{xi:e) M {xi-i,dXi) 



e-1 

/ F,_l (Xl:,_l) TT 

•^x^-^ pil 



rf* ('?^+p-l) 

and a similar manipulation shows 



(105) 



E 



N 



Fi (xo) Fq (xo) 



(106) 



We then have that 



— iV V — ^ — iV — iV 

-P'm(2;o) -7r,„(5) 2^EAr F^ (xq ) - F^_ ^ (xq ) 

1=1 
= 0, 



where p03p . p05p . (|106p and (|104p have been applied. But F^(xo) is just what appears inside the 
expectation (|102p . so the proof is complete. □ 
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5 10 15 20 25 30 35 40 45 50 5 10 15 20 25 30 35 40 45 50 

Figure 3: Left: estimated value of 7Tm{S) against m, for: o,S = 0.8; D,S = 0.9, and +, i5 = 0.99. Right: 
solid lines show sample relative variance of the estimated value of 7rm(0.9) against m using the conditional 
simulation method with: o, a = 1: +, a = 2; a = 4; □, a = 8; and x, a = 16. Dashed line shows 
sample relative variance of 7f,„ (0.9, 1) in the case M = M. 



7.3.2 Numerics 

For some c > we take X = [— c, c] and consider an ergodic Gaussian transition kernel with support 
restricted to [— c, c]. 



M{x,dy) 



exp 



y 



erf 



-x/2 
V2 



erf 



(^)) 



2tt 



]iy)dy, 



and consider U defined by 



U{x) 




For any a S M, assumption (H) holds. The left plot in Figure [3] shows estimated values of tt„i {S), obtained 
from the algorithm with N — 250, n = 500, a = 6 and using the estimator which appears inside the 
expectation in Proposition [71 i.e. a single sample of the conditional Markov chain. The displayed results 
are the averages over 2000 realizations of this entire procedure. The exponential decay rate predicted 
by the large deviation principle (Theorem 2] Part 2.) is apparent. The sample relative variances in the 
case of (5 = 0.9 are shown on the right of [3j for different values of a. The sample relative variance of 
7r,„ (0.9, 1) for the trivial case M = M is also included for reference, and explodes rapidly with m. 

On a very fine grid of a-values, approximations of (a) as per (|16p were obtained with the same 
settings of N and n. These were used to obtain the approximations of [at — Aj,(a)] against a plotted on 
the left of Figure m and an approximation of A^a) was obtained by finite differences, the result is shown 
on the right of Figure 01 The latter plot suggests A^ (10) w 0.9, and bearing in mind the optimality 
result of Theorem m part 4., we then notice in the relative variance plots of Figure [3] that the slowest 
growth (amongst the a values considered) occurs with a ~ 8. 



8 Extensions 

There are a number of possible avenues for further investigation: 
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Figure 4: Left: each of the sohd curves shows an approximation of [at — A^,(a)] against a, with each 
curve corresponding to a different vahie of t in the range [—0.8,0.8]. The cross on each curve indicates 
its maximum and thus approximates the value of sup^, [at — Ai,{a)] = I{t). Right: A^a) against a 
approximated using finite differences. 



Assumption (H) is very restrictive whe n X is non-compact. Starting points fo r the analysis of the 



method under weaker assumptions are [Whitelev 



20111 



Whitelev et al.l . l2012l | , where the stability 



of Feynman-Kac semigroups and particle approximations have been studied under a relaxation of 
the uniform majorization/minorization structure of (H), using a Lyapunov drift condition. 

• As per standard Perron- Frobenius theory, we have not made any reversibility assumptions, and 
this is reflected to some extent in the "forward-backward" structure of the algorithm. It would be 
interesting to address the case that the Markov kernel M is reversible, both in terms of analysis 
and algorithmic structure. 

• The control problem underlying the Bellman equation in Section 17.21 has not received much math- 
ematical attention and could be investigated further. 

• The connection to optimal importance sampling schemes for rare event simulation and estimation 
warrants further study. In particular a study of the variance of the estimator appearing in Proposi- 
tion [7] and propagation of chaos properties associated with blocks of samples drawn from (^P^ n)j 
is desira ble. Furtherm ore, it is of some interest to investigate how optimization schemes such as 
those in 



Kant as 



20091 Chapter 5] could be combined with the algorithm in order to estimate the 



solution of A^(a) = S. 
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